Fast and Adaptive Indexing of Multi-Dimensional Observational Data
نویسندگان
چکیده
Sensing devices generate tremendous amounts of data each day, which include large quantities of multi-dimensional measurements. These data are expected to be immediately available for real-time analytics as they are streamed into storage. Such scenarios pose challenges to state-of-the-art indexing methods, as they must not only support efficient queries but also frequent updates. We propose here a novel indexing method that ingests multi-dimensional observational data in real time. This method primarily guarantees extremely high throughput for data ingestion, while it can be continuously refined in the background to improve query efficiency. Instead of representing collections of points using Minimal Bounding Boxes as in conventional indexes, we model sets of successive points as line segments in hyperspaces, by exploiting the intrinsic value continuity in observational data. This representation reduces the number of index entries and drastically reduces “over-coverage” by entries. Experimental results show that our approach handles real-world workloads gracefully, providing both low-overhead indexing and excellent query efficiency.
منابع مشابه
یک روش مبتنی بر خوشهبندی سلسلهمراتبی تقسیمکننده جهت شاخصگذاری اطلاعات تصویری
It is conventional to use multi-dimensional indexing structures to accelerate search operations in content-based image retrieval systems. Many efforts have been done in order to develop multi-dimensional indexing structures so far. In most practical applications of image retrieval, high-dimensional feature vectors are required, but current multi-dimensional indexing structures lose their effici...
متن کاملAn Adaptive Multi-level Hashing Structure for Fast Approximate Similarity Search
Fast information retrieval is an essential task in data management, mainly due to the increasing availability of data. To address this problem, database researchers have developed indexing techniques to logically organize elements from large datasets in order to answer queries efficiently. In this context, an approximate similarity search algorithm known as Locality Sensitive Hashing (LSH) was ...
متن کاملFast indexing for blocked array layouts to reduce cache misses
The increasing disparity between memory latency and processor speed is a critical bottleneck in achieving high performance. Recently, several studies have been conducted on blocked data layouts, in conjunction with loop tiling to improve locality of references. In this paper, we further reduce cache misses, restructuring the memory layout of multi-dimensional arrays, so that array elements are ...
متن کاملLightweight Indexing of Observational Data in Log-Structured Storage
Huge amounts of data are being generated by sensing devices every day, recording the status of objects and the environment. Such observational data is widely used in scientific research. As the capabilities of sensors keep improving, the data produced are drastically expanding in precision and quantity, making it a write-intensive domain. Log-structured storage is capable of providing high writ...
متن کاملAn Adaptive Physics-Based Method for the Solution of One-Dimensional Wave Motion Problems
In this paper, an adaptive physics-based method is developed for solving wave motion problems in one dimension (i.e., wave propagation in strings, rods and beams). The solution of the problem includes two main parts. In the first part, after discretization of the domain, a physics-based method is developed considering the conservation of mass and the balance of momentum. In the second part, ada...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- PVLDB
دوره 9 شماره
صفحات -
تاریخ انتشار 2016